Machine learning in ScalOps, a higher order cloud computing language

Authors

  • Markus Weimer
  • Tyson Condie
  • Raghu Ramakrishnan
Abstract

Machine learning practitioners are increasingly interested in applying their algorithms to Big Data. Unfortunately, current high-level languages for data analytics in the cloud (e.g., [2, 15, 16, 5]) do not fully cover this domain. One key missing ingredient is a means to express iteration over the data (e.g., [19]). Zaharia et al. were the first to answer this call from a systems perspective with Spark [20]. Spark is built on a data abstraction called resilient distributed datasets (RDDs) that reference immutable data collections. The Spark domain-specific language (DSL) defines standard relational algebra transformations (selection, join, group by, etc.) and mechanisms to cache RDDs in memory. The Spark runtime is optimized for in-memory computation and has published speedups of 30× over Hadoop MapReduce for many machine learning and graph algorithms, which is in line with the gains found using MPI or even special-case implementations (e.g., [18]).


Similar resources

Assessment Methodology for Anomaly-Based Intrusion Detection in Cloud Computing

Cloud computing has become an attractive target for attackers as the mainstream technologies in the cloud, such as the virtualization and multitenancy, permit multiple users to utilize the same physical resource, thereby posing the so-called problem of internal facing security. Moreover, the traditional network-based intrusion detection systems (IDSs) are ineffective to be deployed in the cloud...


Cloud Computing; A New Approach to Learning and Learning

Introduction: The cloud computing and services, as a technological solution for developing educational services, can accelerate the provision and expansion of these highly useful services. This study intended to provide an overall picture of practical areas of learning services based on cloud computing teaching and learning equipment. Methods: This was a theoretical hybrid research study in whi...


Optimization Task Scheduling Algorithm in Cloud Computing

Since software systems play a more important role in applications than ever, security has become one of the most important indicators of software. Cloud computing refers to services that run in a distributed network and are accessible through common internet protocols. Presenting a proper scheduling method can lead to efficiency of resources by decreasing response time and costs. This rese...


A review of methods for resource allocation and operational framework in cloud computing

The issue of management and allocation of resources in cloud computing environments, according to the breadth of scale and modern technology implementation, is a complicated issue. Issues such as: the heterogeneity of resources, resource dependencies to each other, the dynamics of the environment, virtualization, workload diversity as well as a wide range of management objectives of cloud servi...


Cloud Computing Application and Its Advantages and Difficulties in the Teaching Process

The objective of this research is to identify the technology of cloud computing in terms of its concept, its development, its objectives, its components, models, classifications, and the advantages of its use in the teaching process at the University of Samarra, as well as to identify the most important challenges and obstacles that teachers face in using University of Samarra. The researcher u...



Publication date: 2011